Improving Rule-Based Classifiers Induced by MODLEM by Selective Pre-processing of Imbalanced Data
Authors
Abstract
In this paper we discuss the induction of rule-based classifiers from imbalanced data, where one class (the minority class) is under-represented in comparison to the remaining classes (the majority classes). To improve the ability of a classifier to recognize the minority class, we propose a new selective pre-processing approach that is applied to the data before a rule-based classifier is induced. The approach combines selective filtering of the majority classes with focused over-sampling of the minority class. Results of a comparative experimental study show that our approach improves sensitivity for the minority class while preserving the ability of the classifier to recognize examples from the majority classes.
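The abstract gives no algorithmic details, so the following is only a minimal sketch of what "selective filtering of the majority classes" combined with "focused over-sampling of the minority class" could look like. It assumes a k-nearest-neighbour test for spotting difficult examples and a copy count equal to the number of majority neighbours. The function names (`k_nearest`, `selective_preprocess`) and the parameter `k` are hypothetical and are not taken from the paper.

```python
from math import dist


def k_nearest(index, points, k):
    """Return indices of the k points closest to points[index], excluding itself."""
    others = [j for j in range(len(points)) if j != index]
    others.sort(key=lambda j: dist(points[index], points[j]))
    return others[:k]


def selective_preprocess(points, labels, minority, k=3):
    """Illustrative pre-processing step (an assumption, not the authors' method).

    - Majority examples whose neighbours mostly belong to another class are dropped.
    - Minority examples in the same situation are kept and duplicated, once per
      majority neighbour, so that over-sampling is focused on difficult regions.
    """
    new_points, new_labels = [], []
    for i, (point, label) in enumerate(zip(points, labels)):
        neighbours = k_nearest(i, points, k)
        same_class = sum(1 for j in neighbours if labels[j] == label)
        # "Noisy" here means: fewer than half of the neighbours share the label.
        noisy = same_class * 2 < len(neighbours)

        if label != minority:
            if not noisy:  # selective filtering of the majority classes
                new_points.append(point)
                new_labels.append(label)
            continue

        # Minority examples are always kept.
        new_points.append(point)
        new_labels.append(label)
        if noisy:  # focused over-sampling of the minority class
            copies = sum(1 for j in neighbours if labels[j] != minority)
            new_points.extend([point] * copies)
            new_labels.extend([label] * copies)
    return new_points, new_labels
```

The pre-processed examples would then be passed to a rule learner. Only the class distribution changes, so any induction algorithm can be used afterwards.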
Related papers
Selective Pre-processing of Imbalanced Data for Improving Classification Performance
The paper discusses problems of constructing classifiers from imbalanced data. Re-sampling approaches that change the original class distribution are often used to improve performance of classifiers for the minority class. We describe a new approach to selective pre-processing of imbalanced data which combines local over-sampling of the minority class with filtering difficult examples from majo...
Integrating Selective Pre-processing of Imbalanced Data with Ivotes Ensemble
In this paper we present a new framework for improving classifiers learned from imbalanced data. The framework integrates the SPIDER method for selective data pre-processing with the Ivotes ensemble. The goal of this integration is to obtain an improved balance between sensitivity and specificity for the minority class in comparison to a single classifier combined with SPIDER, and to keep over...
The Bagging and n²-Classifiers Based on Rules Induced by MODLEM
An application of the rule induction algorithm MODLEM to the construction of multiple classifiers is studied. Two such classifiers are considered: the bagging approach, where classifiers are generated from different samples of the learning set, and the n²-classifier, which is specialized for solving multiple-class learning problems. This paper reports results of an experimental comparison of the...
Combining rough sets and rule based classifiers for handling imbalanced data
The paper presents two rough-set-based filtering approaches combined with rule-based classifiers suited to handling imbalanced data sets, i.e., data sets where the minority class, which is of primary importance, is under-represented in comparison to the majority classes. We introduce two techniques to detect and process inconsistent majority cases in the boundary between the minority and majority class...
Changing representation of learning examples while inducing classifiers based on decision rules
Decision rules induced from examples are used to predict the classification of new objects. Classification accuracy may be improved by changing the original representation of the learning data. Two different approaches to such a transformation are considered in this paper: selecting the subset of the most relevant attributes with the wrapper approach, and modifying the presence of some le...